Statistical acoustic-to-articulatory mapping unified with speaker normalization based on voice conversion
Abstract
This paper proposes a model of speaker-normalized acoustic-to-articulatory mapping based on statistical voice conversion. A mapping function from acoustic parameters to articulatory parameters is usually trained on parallel data from a single speaker. The resulting model therefore works well only for that speaker, and applying it to other speakers degrades the accuracy of acoustic-to-articulatory mapping. In this paper, a speaker conversion model and an acoustic-to-articulatory mapping model are each implemented with Gaussian Mixture Models (GMMs), and we propose two methods of speaker-normalized acoustic-to-articulatory mapping that combine them. The first concatenates the two models sequentially. The second integrates them into a unified model in which the acoustic parameters of one speaker are converted directly to the articulatory parameters of another speaker. Experiments show that both methods improve the mapping accuracy and that the unified model outperforms the sequential one. The gain is most notable for velar stop consonants, where the mapping accuracy is higher by 0.6 mm.
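The sequential method described in the abstract can be illustrated with a minimal sketch. It assumes scalar acoustic and articulatory features and uses the standard conditional-mean estimate of a joint GMM; the function names, the dictionary layout, and all parameter values below are hypothetical and are not taken from the paper, whose models operate on multidimensional features.

```python
import math

def gmm_regress(x, components):
    """Conditional mean E[y | x] under a joint GMM over scalar (x, y).

    Each component is a dict with keys (hypothetical layout):
    w (weight), mu_x, mu_y (means), s_xx (variance of x), s_xy (covariance).
    """
    # Unnormalized posterior of each component given x: w * N(x; mu_x, s_xx)
    likes = [
        c["w"] * math.exp(-0.5 * (x - c["mu_x"]) ** 2 / c["s_xx"])
        / math.sqrt(2.0 * math.pi * c["s_xx"])
        for c in components
    ]
    total = sum(likes)
    # Posterior-weighted sum of the per-component linear regressions
    return sum(
        (l / total) * (c["mu_y"] + c["s_xy"] / c["s_xx"] * (x - c["mu_x"]))
        for l, c in zip(likes, components)
    )

# Toy single-component models (illustrative values only).
# Maps a new speaker's acoustic feature to the reference speaker's acoustic space.
conversion_gmm = [{"w": 1.0, "mu_x": 0.0, "mu_y": 1.0, "s_xx": 1.0, "s_xy": 0.5}]
# Maps the reference speaker's acoustic feature to an articulatory position.
inversion_gmm = [{"w": 1.0, "mu_x": 1.0, "mu_y": 10.0, "s_xx": 2.0, "s_xy": 1.0}]

def cascade(x_new_speaker):
    """Sequential method: speaker conversion followed by acoustic-to-articulatory mapping."""
    x_reference = gmm_regress(x_new_speaker, conversion_gmm)
    return gmm_regress(x_reference, inversion_gmm)

print(cascade(2.0))
```

In this sketch any error made by the conversion step is passed on to the inversion step, which is one plausible reason a unified model could do better; the abstract itself does not state the cause.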
Similar resources
Speaker adaptation of an acoustic-to-articulatory inversion model using cascaded Gaussian mixture regressions
The article presents a method for adapting a GMM-based acoustic-articulatory inversion model trained on a reference speaker to another speaker. The goal is to estimate the articulatory trajectories in the geometrical space of a reference speaker from the speech audio signal of another speaker. This method is developed in the context of a system of visual biofeedback, aimed at pronunciation trai...
Speaker adaptation of an acoustic-articulatory inversion model using cascaded Gaussian mixture regressions
The article presents a method for adapting a GMM-based acoustic-articulatory inversion model trained on a reference speaker to another speaker. The goal is to estimate the articulatory trajectories in the geometrical space of a reference speaker from the speech audio signal of another speaker. This method is developed in the context of a system of visual biofeedback, aimed at pronunciation trai...
Cross-speaker Acoustic-to-Articulatory Inversion using Phone-based Trajectory HMM for Pronunciation Training
The article presents a statistical mapping approach for cross-speaker acoustic-to-articulatory inversion. The goal is to estimate the most likely articulatory trajectories for a reference speaker from the speech audio signal of another speaker. This approach is developed in the framework of our system of visual articulatory feedback developed for computer-assisted pronunciation training applicat...
Comparing Articulatory and Acoustic Strategies for Reducing Non-Native Accents
This article presents an experimental comparison of two types of techniques, articulatory and acoustic, for transforming non-native speech to sound more native-like. Articulatory techniques use articulators from a native speaker to drive an articulatory synthesizer of the non-native speaker. These methods have a good theoretical justification, but articulatory measurements (e.g., via electromagn...
Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
In this paper, we describe a statistical approach to both an articulatory-to-acoustic mapping and an acoustic-to-articulatory inversion mapping without using phonetic information. The joint probability density of an articulatory parameter and an acoustic parameter is modeled using a Gaussian mixture model (GMM) based on a parallel acoustic-articulatory speech database. We apply the GMM-based ma...